Explainability has been widely stated as a cornerstone of the responsible and trustworthy use of machine learning models. With the ubiquitous use of Deep Neural Network (DNN) models expanding to risk-sensitive and safety-critical domains, many methods have been proposed to explain the decisions of these models. Recent years have also seen concerted efforts that have shown how such explanations can be distorted (attacked) by minor input perturbations. While there have been many surveys that review explainability methods themselves, there has been no effort hitherto to assimilate the different methods and metrics proposed to study the robustness of explanations of DNN models. In this work, we present a comprehensive survey of methods that study, understand, attack, and defend explanations of DNN models. We also present a detailed review of different metrics used to evaluate explanation methods, as well as describe attributional attack and defense methods. We conclude with lessons and take-aways for the community towards ensuring robust explanations of DNN model predictions.
translated by 谷歌翻译
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to readout information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
translated by 谷歌翻译
Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: ''The output depends only on a small (but unknown) segment of the input.'' In several practical applications like image captioning and language translation, this is mostly true. In trained models with an attention mechanism, the outputs of an intermediate module that encodes the segment of input responsible for the output is often used as a way to peek into the `reasoning` of the network. We make such a notion more precise for a variant of the classification problem that we term selective dependence classification (SDC) when used with attention model architectures. Under such a setting, we demonstrate various error modes where an attention model can be accurate but fail to be interpretable, and show that such models do occur as a result of training. We illustrate various situations that can accentuate and mitigate this behaviour. Finally, we use our objective definition of interpretability for SDC tasks to evaluate a few attention model learning algorithms designed to encourage sparsity and demonstrate that these algorithms help improve interpretability.
translated by 谷歌翻译
Document summarization aims to create a precise and coherent summary of a text document. Many deep learning summarization models are developed mainly for English, often requiring a large training corpus and efficient pre-trained language models and tools. However, English summarization models for low-resource Indian languages are often limited by rich morphological variation, syntax, and semantic differences. In this paper, we propose GAE-ISumm, an unsupervised Indic summarization model that extracts summaries from text documents. In particular, our proposed model, GAE-ISumm uses Graph Autoencoder (GAE) to learn text representations and a document summary jointly. We also provide a manually-annotated Telugu summarization dataset TELSUM, to experiment with our model GAE-ISumm. Further, we experiment with the most publicly available Indian language summarization datasets to investigate the effectiveness of GAE-ISumm on other Indian languages. Our experiments of GAE-ISumm in seven languages make the following observations: (i) it is competitive or better than state-of-the-art results on all datasets, (ii) it reports benchmark results on TELSUM, and (iii) the inclusion of positional and cluster information in the proposed model improved the performance of summaries.
translated by 谷歌翻译
Information overloading requires the need for summarizers to extract salient information from the text. Currently, there is an overload of dialogue data due to the rise of virtual communication platforms. The rise of Covid-19 has led people to rely on online communication platforms like Zoom, Slack, Microsoft Teams, Discord, etc. to conduct their company meetings. Instead of going through the entire meeting transcripts, people can use meeting summarizers to select useful data. Nevertheless, there is a lack of comprehensive surveys in the field of meeting summarizers. In this survey, we aim to cover recent meeting summarization techniques. Our survey offers a general overview of text summarization along with datasets and evaluation metrics for meeting summarization. We also provide the performance of each summarizer on a leaderboard. We conclude our survey with different challenges in this domain and potential research opportunities for future researchers.
translated by 谷歌翻译
The necessity of data driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework which helps identify factors impacting a Healthcare Provider Facility or a Hospital (from here on termed as Facility) Market Share is of key importance. This pilot study aims at developing a data driven Machine Learning - Regression framework which aids strategists in formulating key decisions to improve the Facilitys Market Share which in turn impacts in improving the quality of healthcare services. The US (United States) healthcare business is chosen for the study; and the data spanning across 60 key Facilities in Washington State and about 3 years of historical data is considered. In the current analysis Market Share is termed as the ratio of facility encounters to the total encounters among the group of potential competitor facilities. The current study proposes a novel two-pronged approach of competitor identification and regression approach to evaluate and predict market share, respectively. Leveraged model agnostic technique, SHAP, to quantify the relative importance of features impacting the market share. The proposed method to identify pool of competitors in current analysis, develops Directed Acyclic Graphs (DAGs), feature level word vectors and evaluates the key connected components at facility level. This technique is robust since its data driven which minimizes the bias from empirical techniques. Post identifying the set of competitors among facilities, developed Regression model to predict the Market share. For relative quantification of features at a facility level, incorporated SHAP a model agnostic explainer. This helped to identify and rank the attributes at each facility which impacts the market share.
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译
不断增加的材料科学文章使得很难从已发表的文献中推断化学结构 - 培训关系。我们使用自然语言处理(NLP)方法从聚合物文献的摘要中自动提取材料属性数据。作为我们管道的组成部分,我们使用240万材料科学摘要培训了一种语言模型的材料,该材料模型在用作文本编码器时,在五分之三命名实体识别数据集中的其他基线模型都优于其他基线模型。使用此管道,我们在60小时内从约130,000个摘要中获得了约300,000个物质记录。分析了提取的数据,分析了各种应用,例如燃料电池,超级电容器和聚合物太阳能电池,以恢复非平凡的见解。通过我们的管道提取的数据可通过https://polymerscholar.org的Web平台提供,该数据可方便地定位摘要中记录的材料属性数据。这项工作证明了自动管道的可行性,该管道从已发布的文献开始,并以一组完整的提取物质属性信息结束。
translated by 谷歌翻译
批次归一化被广泛用于深度学习以使中间激活归一化。深层网络臭名昭著地增加了训练的复杂性,要​​求仔细的体重初始化,需要较低的学习率等。这些问题已通过批归一化解决(\ textbf {bn})来解决,通过将激活的输入归功于零平均值和单位标准偏差。使培训过程的批归归量化部分显着加速了非常深网络的训练过程。一个新的研究领域正在进行研究\ textbf {bn}成功背后的确切理论解释。这些理论见解中的大多数试图通过将其对优化,体重量表不变性和正则化的影响来解释\ textbf {bn}的好处。尽管\ textbf {bn}在加速概括方面取得了不可否认的成功,但分析的差距将\ textbf {bn}与正则化参数的效果相关联。本文旨在通过\ textbf {bn}对正则化参数的数据依赖性自动调整,并具有分析证明。我们已将\ textbf {bn}提出为对非 - \ textbf {bn}权重的约束优化,通过该优化,我们通过它演示其数据统计信息依赖于正则化参数的自动调整。我们还为其在嘈杂的输入方案下的行为提供了分析证明,该方案揭示了正则化参数的信号与噪声调整。我们还通过MNIST数据集实验的经验结果证实了我们的主张。
translated by 谷歌翻译
Twenty20板球,有时是二十20,经常缩写为T20,是板球的一小部分。在一场二十二十比赛中,两支球员组成的两支球队都有一局,最多仅限20分。这个版本的板球尤其是不可预测的,这是它最近在近期越来越受欢迎的原因之一。但是,在本文中,我们尝试了四种不同的方法来预测T20板球比赛的结果。具体来说,我们要考虑:以前的竞争团队参与者的绩效统计数据,从知名的板球统计网站获得的球员的评分,以相似的性能统计数据和基于ELO基于ELO的方法来汇率玩家。我们通过使用逻辑回归,支持向量机,贝叶斯网络,决策树,随机森林来比较每种方法的性能。
translated by 谷歌翻译